Detection of Malicious and Low Throughput Data Exfiltration Over the DNS Protocol
In the presence of security countermeasures, a malware designed for data
exfiltration must do so using a covert channel to achieve its goal. Among
existing covert channels stands the domain name system (DNS) protocol. Although
the detection of covert channels over the DNS has been thoroughly studied in
the last decade, previous research dealt with a specific subclass of covert
channels, namely DNS tunneling. While the importance of tunneling detection is
not undermined, an entire class of low throughput DNS exfiltration malware
remained overlooked. The goal of this study is to propose a method for
detecting both tunneling and low-throughput data exfiltration over the DNS.
Towards this end, we propose a solution composed of a supervised feature
selection method and an interchangeable, adjustable anomaly detection model
trained on legitimate traffic. In the first step, a one-class classifier
is applied for detecting domain-specific traffic that does not conform with the
normal behavior. Then, in the second step, in order to reduce the false
positive rate resulting from the attempt to detect the low-throughput data
exfiltration, we apply a rule-based filter that filters out data exchange over
DNS used by legitimate services. Our solution was evaluated on the logs of a
medium-scale recursive DNS server, and involved more than 75,000 legitimate
uses and almost 2,000 attacks. The evaluation results show that while DNS tunneling is
covered with at least 99% recall rate and less than 0.01% false positive rate,
the detection of low-throughput exfiltration is more difficult. While not
preventing it completely, our solution limits a malware attempting to avoid
detection to at most 1 kb/h of payload under the limitations of the DNS
syntax (equivalent to five credit card details, or ten user credentials, per
hour), which reduces the effectiveness of the attack.
Comment: 5 figures, 7 tables
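The two-step pipeline described above can be sketched as follows. Everything here is illustrative: the feature set, the z-score rule (standing in for the trained one-class classifier), the threshold k, and the allowlisted domain are assumptions, not the paper's actual choices.

```python
import math
from collections import Counter

def entropy(s):
    """Shannon entropy (bits/char); encoded exfiltration payloads tend to score high."""
    counts = Counter(s)
    return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

def query_features(qname):
    """Per-query features (illustrative, not the paper's exact feature set)."""
    labels = qname.rstrip(".").split(".")
    subdomain = ".".join(labels[:-2]) if len(labels) > 2 else ""
    return [len(qname), len(subdomain), entropy(qname)]

class ZScoreOneClass:
    """Stand-in for the one-class classifier: fit per-feature mean/std on
    legitimate traffic, flag queries deviating more than k std devs."""
    def __init__(self, k=3.0):
        self.k, self.mean, self.std = k, [], []

    def fit(self, rows):
        cols = list(zip(*rows))
        self.mean = [sum(c) / len(c) for c in cols]
        self.std = [max((sum((x - m) ** 2 for x in c) / len(c)) ** 0.5, 1e-9)
                    for c, m in zip(cols, self.mean)]
        return self

    def is_anomalous(self, row):
        return any(abs(x - m) / s > self.k
                   for x, m, s in zip(row, self.mean, self.std))

# Step 2: rule-based filter for services known to legitimately exchange
# data over DNS (hypothetical allowlist entry).
ALLOWLIST = {"example-av.com"}

def detect(qname, model):
    base = ".".join(qname.rstrip(".").split(".")[-2:])
    if base in ALLOWLIST:
        return False
    return model.is_anomalous(query_features(qname))

legit = ["www.example.com", "mail.google.com", "cdn.jsdelivr.net",
         "en.wikipedia.org", "api.github.com", "news.ycombinator.com"]
model = ZScoreOneClass().fit([query_features(q) for q in legit])
exfil = "4f2a9c81d7e3b6a0f5c2e8d1a9b7c4e6f3a1d8.t.evil-c2.com"
```

A long, high-entropy encoded subdomain deviates from the legitimate profile and is flagged, while allowlisted services bypass the anomaly model entirely.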
MaskDGA: A Black-box Evasion Technique Against DGA Classifiers and Adversarial Defenses
Domain generation algorithms (DGAs) are commonly used by botnets to generate
domain names through which bots can establish a resilient communication channel
with their command and control servers. Recent publications presented deep
learning, character-level classifiers that are able to detect algorithmically
generated domain (AGD) names with high accuracy, and correspondingly,
significantly reduce the effectiveness of DGAs for botnet communication. In
this paper we present MaskDGA, a practical adversarial learning technique that
adds perturbation to the character-level representation of algorithmically
generated domain names in order to evade DGA classifiers, without the attacker
having any knowledge about the DGA classifier's architecture and parameters.
MaskDGA was evaluated using the DMD-2018 dataset of AGD names and four recently
published DGA classifiers, in which the average F1-score of the classifiers
degrades from 0.977 to 0.495 when applying the evasion technique. An additional
evaluation was conducted using the same classifiers but with adversarial
defenses implemented: adversarial re-training and distillation. The results of
this evaluation show that MaskDGA can be used for improving the robustness of
character-level DGA classifiers against adversarial attacks, but that
ideally DGA classifiers should incorporate additional features alongside the
character-level features, which are demonstrated in this study to be vulnerable
to adversarial attacks.
Comment: 12 pages, 2 figures
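The perturbation step can be caricatured as below. This is a heavily simplified stand-in: MaskDGA chooses which characters to replace using gradients of a locally trained surrogate model, whereas this sketch picks positions at random; the alphabet, replacement fraction, and example domain are assumptions.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

def mask_chars(domain, frac=0.5, seed=0):
    """Replace a fraction of second-level-domain characters while keeping
    the domain's length and TLD intact (random positions stand in for the
    surrogate-gradient selection used by the actual technique)."""
    rng = random.Random(seed)
    sld, _, tld = domain.partition(".")
    chars = list(sld)
    for i in rng.sample(range(len(chars)), max(1, int(len(chars) * frac))):
        chars[i] = rng.choice([c for c in ALPHABET if c != chars[i]])
    return "".join(chars) + "." + tld

adversarial = mask_chars("qkfzxvbnrt.com")  # made-up AGD name
```

The perturbed name preserves length and TLD, so it remains a syntactically valid domain while its character distribution no longer matches what the classifier learned.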
Detecting Cyberattacks in Industrial Control Systems Using Convolutional Neural Networks
This paper presents a study on detecting cyberattacks on industrial control
systems (ICS) using unsupervised deep neural networks, specifically,
convolutional neural networks. The study was performed on a SecureWater
Treatment testbed (SWaT) dataset, which represents a scaled-down version of a
real-world industrial water treatment plant. e suggest a method for anomaly
detection based on measuring the statistical deviation of the predicted value
from the observed value.We applied the proposed method by using a variety of
deep neural networks architectures including different variants of
convolutional and recurrent networks. The test dataset from SWaT included 36
different cyberattacks. The proposed method successfully detects the vast
majority of the attacks with a low false positive rate thus improving on
previous works based on this data set. The results of the study show that 1D
convolutional networks can be successfully applied to anomaly detection in
industrial control systems and outperform more complex recurrent networks while
being much smaller and faster to train.
Comment: Proceedings of the 2018 Workshop on Cyber-Physical Systems Security and Privacy
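The anomaly rule itself (flag points whose prediction residual deviates statistically from residuals seen on clean data) can be sketched independently of the network. A naive moving-average predictor stands in for the paper's 1D CNN here, and the window size and threshold are assumptions.

```python
def residuals(series, w=5):
    """Prediction residuals; a moving-average predictor stands in for the 1D CNN."""
    return [series[i] - sum(series[i - w:i]) / w for i in range(w, len(series))]

def fit_stats(res):
    """Mean/std of residuals on (assumed attack-free) training data."""
    mu = sum(res) / len(res)
    sd = (sum((r - mu) ** 2 for r in res) / len(res)) ** 0.5
    return mu, max(sd, 1e-9)

def detect(series, mu, sd, w=5, k=4.0):
    """Indices whose prediction residual deviates more than k std devs."""
    return [i for i in range(w, len(series))
            if abs(series[i] - sum(series[i - w:i]) / w - mu) / sd > k]

train = [10 + 0.1 * (i % 3) for i in range(60)]  # clean periodic sensor readings
mu, sd = fit_stats(residuals(train))
test = list(train)
test[30] = 50.0                                  # injected attack spike
alarms = detect(test, mu, sd)
```

The spike produces a residual hundreds of standard deviations from the clean residual distribution, while the clean series raises no alarms.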
Deployment Optimization of IoT Devices through Attack Graph Analysis
The Internet of Things (IoT) has become an integral part of our lives at both
work and home. However, these IoT devices are prone to vulnerability exploits
due to their low cost, limited resources, diversity of vendors, and proprietary
firmware. Moreover, short-range communication protocols (e.g., Bluetooth or
ZigBee) open additional opportunities for the lateral movement of an attacker
within an organization. Thus, the type and location of IoT devices may
significantly change the level of network security of the organizational
network. In this paper, we quantify the level of network security based on an
augmented attack graph analysis that accounts for the physical location of IoT
devices and their communication capabilities. We use the depth-first branch and
bound (DFBnB) heuristic search algorithm to solve two optimization problems:
Full Deployment with Minimal Risk (FDMR) and Maximal Utility without Risk
Deterioration (MURD). An admissible heuristic is proposed to accelerate the
search. The proposed method is evaluated using a real network with simulated
deployment of IoT devices. The results demonstrate (1) the contribution of the
augmented attack graphs to quantifying the impact of IoT devices deployed
within the organization on security, and (2) the effectiveness of the optimized
IoT deployment.
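A depth-first branch and bound skeleton with an admissible (never-overestimating) heuristic can be sketched as follows. The additive per-device risk table is a toy stand-in for the attack-graph-derived risk in the paper; device and location names are made up.

```python
def dfbnb(devices, locations, risk):
    """Assign each device a location minimizing total risk. The heuristic
    (each unassigned device's cheapest location) lower-bounds every
    completion, so pruning never discards the optimal plan."""
    best = {"cost": float("inf"), "plan": None}
    cheapest = {d: min(risk[d, l] for l in locations) for d in devices}

    def search(i, cost, plan):
        if i == len(devices):
            if cost < best["cost"]:
                best["cost"], best["plan"] = cost, dict(plan)
            return
        if cost + sum(cheapest[d] for d in devices[i:]) >= best["cost"]:
            return  # bound: this branch cannot beat the incumbent
        d = devices[i]
        for l in sorted(locations, key=lambda l: risk[d, l]):
            plan[d] = l
            search(i + 1, cost + risk[d, l], plan)
        del plan[d]

    search(0, 0.0, {})
    return best["cost"], best["plan"]

# Toy instance: risk contribution of deploying each device at each location.
risk = {("camera", "lobby"): 5, ("camera", "lab"): 1,
        ("lock", "lobby"): 2, ("lock", "lab"): 4}
cost, plan = dfbnb(["camera", "lock"], ["lobby", "lab"], risk)
```

Sorting children by cost lets the search find a good incumbent early, which tightens the bound and accelerates pruning — the role the proposed admissible heuristic plays in the paper.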
MDGAN: Boosting Anomaly Detection Using Multi-Discriminator Generative Adversarial Networks
Anomaly detection is often considered a challenging field of machine learning
due to the difficulty of obtaining anomalous samples for training and the need
to obtain a sufficient amount of training data. In recent years, autoencoders
have been shown to be effective anomaly detectors that train only on "normal"
data. Generative adversarial networks (GANs) have been used to generate
additional training samples for classifiers, thus making them more accurate and
robust. However, in anomaly detection GANs are only used to reconstruct
existing samples rather than to generate additional ones. This stems both from
the small amount and lack of diversity of anomalous data in most domains. In
this study we propose MDGAN, a novel GAN architecture for improving anomaly
detection through the generation of additional samples. Our approach uses two
discriminators: a dense network for determining whether the generated samples
are of sufficient quality (i.e., valid) and an autoencoder that serves as an
anomaly detector. MDGAN enables us to reconcile two conflicting goals: 1)
generate high-quality samples that can fool the first discriminator, and 2)
generate samples that can eventually be effectively reconstructed by the second
discriminator, thus improving its performance. Empirical evaluation on a
diverse set of datasets demonstrates the merits of our approach.
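The two conflicting goals can be made concrete as a combined generator objective. The log-loss form and the weighting term alpha below are assumptions for illustration, not the paper's stated loss.

```python
import math

def generator_loss(d_prob, recon_error, alpha=0.5):
    """Combined objective for an MDGAN-style generator: the adversarial
    term rewards fooling the dense discriminator (d_prob -> 1), while the
    reconstruction term rewards samples the autoencoder discriminator can
    reconstruct well (recon_error -> 0). alpha trades the two off."""
    adversarial = -math.log(max(d_prob, 1e-12))
    return adversarial + alpha * recon_error
```

A generated sample that both looks valid and reconstructs well scores lower than one that fails either discriminator, so minimizing this loss pushes the generator toward samples useful for training the anomaly detector.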
Content-based data leakage detection using extended fingerprinting
Protecting sensitive information from unauthorized disclosure is a major
concern of every organization. As an organization's employees need to access
such information in order to carry out their daily work, data leakage detection
is both an essential and challenging task. Whether caused by malicious intent
or an inadvertent mistake, data loss can result in significant damage to the
organization. Fingerprinting is a content-based method used for detecting data
leakage. In fingerprinting, signatures of known confidential content are
extracted and matched with outgoing content in order to detect leakage of
sensitive content. Existing fingerprinting methods, however, suffer from two
major limitations. First, fingerprinting can be bypassed by rephrasing (or
minor modification) of the confidential content, and second, usually the whole
content of a document is fingerprinted (including non-confidential parts),
resulting in false alarms. In this paper we propose an extension to the
fingerprinting approach that is based on sorted k-skip-n-grams. The proposed
method is able to produce a fingerprint of the core confidential content which
ignores non-relevant (non-confidential) sections. In addition, the proposed
fingerprinting method is more robust to rephrasing and can also be used to
detect a previously unseen confidential document, therefore providing better
detection of intentional leakage incidents.
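The core idea of sorted k-skip-n-grams (sorting makes each gram order-insensitive; skipping lets it survive word insertions) can be sketched as below. Tokenization, gram selection, and hashing details are omitted, and the window construction is one plausible reading rather than the paper's exact definition.

```python
from itertools import combinations

def sorted_skip_ngrams(text, n=2, k=1):
    """All sorted n-token subsequences drawn from each (n + k)-token
    window, so a locally reordered or padded rephrasing still shares
    grams with the original confidential text."""
    tokens = text.lower().split()
    grams = set()
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n + k]
        for combo in combinations(window, n):
            grams.add(tuple(sorted(combo)))
    return grams

original = sorted_skip_ngrams("the formula contains sodium chloride")
rephrased = sorted_skip_ngrams("the formula contains chloride and sodium")
```

Plain bigrams of the two sentences would miss the reordered pair entirely; the sorted skip-grams still overlap, which is what makes the fingerprint robust to rephrasing.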
Analysis of Location Data Leakage in the Internet Traffic of Android-based Mobile Devices
In recent years we have witnessed a shift towards personalized, context-based
applications and services for mobile device users. A key component of many of
these services is the ability to infer the current location and predict the
future location of users based on location sensors embedded in the devices.
Such knowledge enables service providers to present relevant and timely offers
to their users and better manage traffic congestion, thus increasing
customer satisfaction and engagement. However, such services suffer from
location data leakage which has become one of today's most concerning privacy
issues for smartphone users. In this paper we focus specifically on location
data that is exposed by Android applications via Internet network traffic in
plaintext (i.e., without encryption) without the user's awareness. We present
an empirical evaluation, involving the network traffic of real mobile device
users, aimed at: (1) measuring the extent of location data leakage in the
Internet traffic of Android-based smartphone devices; and (2) understanding the
value of this data by inferring users' points of interests (POIs). This was
achieved by analyzing the Internet traffic recorded from the smartphones of a
group of 71 participants for an average period of 37 days. We also propose a
procedure for mining and filtering location data from raw network traffic and
utilize geolocation clustering methods to infer users' POIs. The key findings
of this research center on the extent of this phenomenon in terms of both
ubiquity and severity; we found that over 85% of the users' devices leak
location data, and the exposure rate of users' POIs, derived from the
relatively sparse leakage indicators, is around 61%.
Comment: 11 pages, 10 figures
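The mining-and-clustering procedure can be sketched in a few lines. The query-parameter names matched here are illustrative (real apps leak coordinates under many vendor-specific keys), and the grid heuristic is a toy stand-in for the geolocation clustering used in the paper.

```python
import re
from collections import Counter

COORD_RE = re.compile(r"lat(?:itude)?=(-?\d{1,2}\.\d+).*?"
                      r"(?:lon(?:gitude)?|lng)=(-?\d{1,3}\.\d+)")

def mine_coords(payloads):
    """Extract plaintext lat/lon pairs from captured HTTP payloads."""
    hits = (COORD_RE.search(p) for p in payloads)
    return [(float(m.group(1)), float(m.group(2))) for m in hits if m]

def grid_pois(points, cell=0.001, min_pts=3):
    """Toy POI inference: bucket location fixes into ~100 m grid cells and
    keep dense cells (a stand-in for proper geolocation clustering)."""
    cells = Counter((round(lat / cell), round(lon / cell)) for lat, lon in points)
    return [c for c, n in cells.items() if n >= min_pts]

traffic = ["GET /ads?lat=32.1093&lon=34.8555 HTTP/1.1"] * 4 + \
          ["GET /w?lat=31.7683&lon=35.2137 HTTP/1.1"]
pois = grid_pois(mine_coords(traffic))
```

Repeated fixes at one spot form a dense cell (a candidate POI), while a one-off fix elsewhere is discarded — the same density intuition behind clustering sparse leakage indicators into POIs.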
Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection
Neural networks have become an increasingly popular solution for network
intrusion detection systems (NIDS). Their capability of learning complex
patterns and behaviors makes them a suitable solution for differentiating
between normal traffic and network attacks. However, a drawback of neural
networks is the amount of resources needed to train them. Many network gateway
and router devices, which could potentially host an NIDS, simply do not have
the memory or processing power to train and sometimes even execute such models.
More importantly, the existing neural network solutions are trained in a
supervised manner, meaning that an expert must label the network traffic and
update the model manually from time to time.
In this paper, we present Kitsune: a plug-and-play NIDS which can learn to
detect attacks on the local network, without supervision, and in an efficient
online manner. Kitsune's core algorithm (KitNET) uses an ensemble of neural
networks called autoencoders to collectively differentiate between normal and
abnormal traffic patterns. KitNET is supported by a feature extraction
framework which efficiently tracks the patterns of every network channel. Our
evaluations show that Kitsune can detect various attacks with a performance
comparable to offline anomaly detectors, even on a Raspberry Pi. This
demonstrates that Kitsune can be a practical and economic NIDS.
Comment: Appears in Network and Distributed Systems Security Symposium (NDSS) 201
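KitNET's two-tier structure (an ensemble of small reconstructors over feature subspaces, plus an output tier combining their errors) can be caricatured in a few lines. Real KitNET trains small autoencoders online; here a per-subspace mean reconstruction stands in for them, and the feature values, subspace split, and RMS combiner are all assumptions.

```python
class TinyEnsembleDetector:
    """Structure-only sketch of a KitNET-like detector: tier 1 scores each
    feature subspace by reconstruction error (mean reconstruction stands
    in for a small autoencoder), tier 2 combines the per-subspace errors."""
    def __init__(self, subspaces):
        self.subspaces = subspaces  # disjoint groups of feature indices
        self.means = []

    def fit(self, X):
        self.means = [[sum(x[i] for x in X) / len(X) for i in idx]
                      for idx in self.subspaces]
        return self

    def score(self, x):
        errs = [(sum((x[i] - m) ** 2 for i, m in zip(idx, mean)) / len(idx)) ** 0.5
                for idx, mean in zip(self.subspaces, self.means)]
        return (sum(e * e for e in errs) / len(errs)) ** 0.5  # output tier

normal = [[1.0, 2.0, 3.0, 4.0]] * 20  # per-channel traffic features
det = TinyEnsembleDetector([(0, 1), (2, 3)]).fit(normal)
```

Splitting the feature space into small subspaces is what keeps each ensemble member cheap enough for resource-constrained hardware, while the output tier aggregates their verdicts into one anomaly score.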
Can't Boil This Frog: Robustness of Online-Trained Autoencoder-Based Anomaly Detectors to Adversarial Poisoning Attacks
In recent years, a variety of effective neural network-based methods for
anomaly and cyber attack detection in industrial control systems (ICSs) have
been demonstrated in the literature. Given their successful implementation and
widespread use, there is a need to study adversarial attacks on such detection
methods to better protect the systems that depend upon them. The extensive
research performed on adversarial attacks on image and malware classification
has little relevance to the physical system state prediction domain, which most
of the ICS attack detection systems belong to. Moreover, such detection systems
are typically retrained using new data collected from the monitored system,
thus the threat of adversarial data poisoning is significant; however, this
threat has not yet been addressed by the research community. In this paper, we
present the first study focused on poisoning attacks on online-trained
autoencoder-based attack detectors. We propose two algorithms for generating
poison samples, an interpolation-based algorithm and a back-gradient
optimization-based algorithm, which we evaluate on both synthetic and
real-world ICS data. We demonstrate that the proposed algorithms can generate
poison samples that cause the target attack to go undetected by the autoencoder
detector; however, the ability to poison the detector is limited to a small set
of attack types and magnitudes. When the poison-generating algorithms are
applied to the popular SWaT dataset, we show that the autoencoder detector
trained on the physical system state data is resilient to poisoning in the face
of all ten of the relevant attacks in the dataset. This finding suggests that
neural network-based attack detectors used in the cyber-physical domain are
more robust to poisoning than in other problem domains, such as malware
detection and image processing.
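The interpolation idea (no single retraining batch looks anomalous, yet the detector's notion of "normal" drifts toward the attack state) reduces to a simple schedule. This sketch conveys only the intuition, not the paper's actual algorithm; the state vectors and step count are made up.

```python
def interpolation_poison(clean, target, steps):
    """Emit `steps` samples moving linearly from a clean system state
    toward the attack state, intended to be fed into successive online
    retraining windows of the detector."""
    return [[(1 - s / steps) * c + (s / steps) * t
             for c, t in zip(clean, target)]
            for s in range(1, steps + 1)]

schedule = interpolation_poison(clean=[0.0, 0.0], target=[8.0, 4.0], steps=4)
```

Each successive sample is only a small step from the previous one, so an online-retrained detector that accepts one batch is likely to accept the next — until the final batch equals the attack state itself.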
Query-Efficient Black-Box Attack Against Sequence-Based Malware Classifiers
In this paper, we present a generic, query-efficient black-box attack against
API call-based machine learning malware classifiers. We generate adversarial
examples by modifying the malware's API call sequences and non-sequential
features (printable strings), and these adversarial examples will be
misclassified by the target malware classifier without affecting the malware's
functionality. In contrast to previous studies, our attack minimizes the number
of malware classifier queries required. In addition, in our attack, the
attacker need only know the class predicted by the malware classifier;
knowledge of the classifier's confidence score is optional. We evaluate
the attack effectiveness when attacks are performed against a variety of
malware classifier architectures, including recurrent neural network (RNN)
variants, deep neural networks, support vector machines, and gradient boosted
decision trees. Our attack success rate is around 98% when the classifier's
confidence score is known and 64% when just the classifier's predicted class is
known. We implement four state-of-the-art query-efficient attacks and show that
our attack requires fewer queries and less knowledge about the attacked model's
architecture than other existing query-efficient attacks, making it practical
for attacking cloud-based malware classifiers at a minimal cost.
Comment: Accepted as a conference paper at ACSAC 202
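The decision-based (label-only) setting can be illustrated with a greedy padding loop. Both the toy classifier and the insertion strategy are placeholders: the actual attack modifies API call sequences far more carefully to remain query-efficient and preserve functionality, and real targets are RNNs, SVMs, or gradient boosted trees rather than a call-ratio rule.

```python
SUSPICIOUS = {"WriteProcessMemory", "CreateRemoteThread"}

def toy_classifier(seq):
    """Placeholder target: flags a sequence whose majority of calls are
    suspicious."""
    return "malware" if sum(c in SUSPICIOUS for c in seq) / len(seq) > 0.5 else "benign"

def evade(sequence, classify, benign_calls, max_queries=50):
    """Label-only evasion: prepend semantically inert API calls one at a
    time, querying only for the predicted class, until the label flips or
    the query budget runs out."""
    seq, queries = list(sequence), 0
    for call in benign_calls:
        seq = [call] + seq
        queries += 1
        if queries > max_queries:
            break
        if classify(seq) == "benign":
            return seq, queries
    return None, queries

adv, used = evade(["WriteProcessMemory", "CreateRemoteThread"],
                  toy_classifier, ["RegQueryValueExW", "GetTickCount", "Sleep"])
```

Counting every classifier call against a budget is the essential discipline of a query-efficient attack: the attacker's cost against a cloud-hosted classifier is measured in queries, not gradient computations.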